Modeling Skewed Class Distributions by Reshaping the Concept Space
Abstract
We introduce an approach to learning from imbalanced class distributions that does not change the underlying data distribution. The ICC algorithm decomposes majority classes into smaller subclasses, which yields a more balanced class distribution. In this paper, we explain how ICC can not only address the class imbalance problem but may also increase the expressive power of the hypothesis space. We validate ICC and analyze alternative decomposition methods on well-known machine learning datasets as well as new problems in pervasive computing. Our results indicate that ICC performs as well as or better than existing approaches to handling class imbalance.
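The abstract does not say how ICC forms its subclasses, so the following is only a minimal sketch of the general idea of majority-class decomposition, not the authors' algorithm. It assumes plain k-means as the decomposition method and picks the number of subclasses so that each is roughly the size of the smallest class. The names `kmeans`, `decompose_majority`, and `parent` are hypothetical, and the data is synthetic.

```python
import random
from collections import Counter

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples of floats; returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it if the cluster is empty.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def decompose_majority(X, y, k=None, seed=0):
    """Relabel the largest class as up to k subclasses.

    Returns the new label list and a dict mapping every label back to its original class.
    """
    counts = Counter(y)
    majority, n_major = counts.most_common(1)[0]
    n_minor = min(counts.values())
    if k is None:
        # Assumption (not from the source): aim for subclasses about as large
        # as the smallest class.
        k = max(1, round(n_major / n_minor))
    idx = [i for i, label in enumerate(y) if label == majority]
    k = min(k, len(idx))
    clusters = kmeans([X[i] for i in idx], k, seed=seed)
    new_y = list(y)
    parent = {label: label for label in counts}
    for i, c in zip(idx, clusters):
        sub = f"{majority}#{c}"
        new_y[i] = sub
        parent[sub] = majority
    return new_y, parent

# Synthetic 90/10 imbalanced data set; the features themselves are never modified.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
X += [(rng.gauss(4, 1), rng.gauss(4, 1)) for _ in range(10)]
y = ["neg"] * 90 + ["pos"] * 10

sub_y, parent = decompose_majority(X, y)
print(Counter(sub_y))
```

Any multi-class learner could then be trained on `sub_y`, with each prediction mapped back to its original class through `parent`. Only the labels change, so the data distribution stays as it was, which is the property the abstract emphasizes.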
Similar resources
Using Weighted Distributions for Modeling Skewed, Multimodal and Truncated Data
When the observations reflect a multimodal, asymmetric, or truncated structure, or a combination of these, using the usual unimodal and symmetric distributions leads to misleading results. Therefore, distributions able to model skewness, multimodality, and truncation have always been at the core of interest in the statistical literature. There are different methods to contract ...
CADERNOS DE COMPUTAÇÃO XX (2003) Learning with Skewed Class Distributions
Several aspects may influence the performance achieved by a classifier created by a Machine Learning system. One of these aspects is the difference between the numbers of examples belonging to each class. When this difference is large, the learning system may have difficulty learning the concept related to the minority class. In this work, we discuss several issues related to learn...
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams but cannot handle well rather skewed (e.g., few positives but lots of negatives) and stochastic distributions, which are typical in many data stream applications. In this paper, we propose a new approach to mine data stre...
Mixtures of skewed Kalman filters
Normal state-space models are prevalent, but to increase the applicability of the Kalman filter, we propose mixtures of skewed, and extended skewed, Kalman filters. To do so, the closed skew-normal distribution is extended to a scale-mixture class of closed skew-normal distributions. Some basic properties are derived and a class of closed skew-t distributions is obtained. Our suggested family of...
Economic design of x̄ control charts considering process shift distributions
Process shift is an important input parameter in the economic design of control charts. Earlier x̄ control chart designs considered constant shifts to occur in the mean of the process for a given assignable cause. This assumption has been criticized by many researchers since it may not be realistic to produce a constant shift whenever an assignable cause occurs. To overcome this difficulty...
Publication date: 2017